Speaker Identification Based on Log Area Ratio and Gaussian Mixture Models in Narrow-Band Speech: Speech Understanding / Interaction
Abstract
Log area ratio (LAR) coefficients, derived from linear prediction coefficients (LPC), are a well-known feature used in speech applications. This paper presents a novel way to use the LAR feature in a speaker identification system: instead of the mel-frequency cepstral coefficients (MFCC), the LAR feature is used in a Gaussian mixture model (GMM) based speaker identification system. An F-ratio feature analysis conducted on both the LAR and MFCC feature vectors showed that the lower-order LAR coefficients are superior to their MFCC counterparts. The text-independent, closed-set speaker identification rate, as tested on the downsampled version of the TIMIT database, improved from 96.73% using the MFCC feature to 98.81% using the LAR feature.
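As a rough illustration of the two quantities the abstract names, the sketch below derives LAR coefficients from one speech frame through the reflection coefficients of a Levinson-Durbin recursion, and computes an F-ratio for a single feature dimension. It is not the paper's implementation: the function names are hypothetical, the abstract does not state the model order, windowing, or pre-emphasis, and the sign convention for reflection coefficients (and hence for LAR) differs between texts.

```python
import math

def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one (already windowed) frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]

def reflection_coefficients(r, order):
    """Levinson-Durbin recursion; returns k_1..k_order.

    Assumes r[0] > 0 (a non-silent frame); a silent frame would divide by zero.
    """
    a = [1.0] + [0.0] * order      # A(z) = 1 + sum_j a[j] z^-j
    err = r[0]                     # prediction error energy
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        ks.append(k)
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return ks

def log_area_ratios(ks):
    """LAR_i = ln((1 + k_i) / (1 - k_i)), defined for |k_i| < 1."""
    return [math.log((1.0 + k) / (1.0 - k)) for k in ks]

def f_ratio(values_by_speaker):
    """Variance of the speaker means divided by the mean within-speaker variance.

    values_by_speaker: one list of feature values per speaker, for a single
    feature dimension (for example, the first LAR coefficient).
    """
    means = [sum(v) / len(v) for v in values_by_speaker]
    grand = sum(means) / len(means)
    between = sum((m - grand) ** 2 for m in means) / len(means)
    within = sum(sum((x - m) ** 2 for x in v) / len(v)
                 for v, m in zip(values_by_speaker, means)) / len(means)
    return between / within
```

A frame would typically be passed as `log_area_ratios(reflection_coefficients(autocorrelation(frame, p), p))` for an assumed order `p`. A larger F-ratio for a coefficient indicates that it separates speakers better relative to its spread within each speaker, which is the sense in which the abstract compares lower-order LAR and MFCC coefficients.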
Similar resources
Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Speech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty
In this paper, an estimator for speech enhancement based on a Laplacian mixture model is proposed. The proposed method estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE estimator, where the clean speech DFT coefficients are assumed to follow a mixture of Laplacians and the DFT coefficients of the noise are assumed to follow a zero-mean Gaussian distribution. Furthermore, the MMS...
Robust text-independent speaker identification using Gaussian mixture speaker models
This paper introduces and motivates the use of Gaussian mixture models (GMM) for robust text-independent speaker identification. The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are effective for modeling speaker identity. The focus of this work is on applications which require high identification rates using short utterance ...
Text-independent speaker identification using Gaussian mixture bigram models
In this paper, a novel speaker modeling technique based on Gaussian mixture bigram model (GMBM) is introduced and evaluated for text-independent speaker identification (speaker-ID). GMBM is a stochastic framework that explores the context or time dependency of continuous observations from an information source. In view of the fact that speech features are correlated between successive frames, w...